feat: centralize dynamic env var construction in config.py, remove composite vars from .env#1105

Merged: iziang merged 2 commits into main from support/env on Jul 13, 2025

@iziang commented Jul 13, 2025

- Added PostgreSQL, Redis, and Elasticsearch configuration fields to `Config` class in `config.py`.
- Updated `env.template` to reflect new environment variables for Redis and Elasticsearch.
- Removed redundant URL construction from `entrypoint.sh`, as URLs are now built within the `Config` class.
- Adjusted `start-celery-flower.sh` to simplify Celery command execution.

@apecloud-bot added the size/L label (denotes a PR that changes 100-499 lines) Jul 13, 2025
@iziang changed the title from "feat: enhance configuration management for database and services" to "feat: centralize dynamic env var construction in config.py, remove composite vars from .env" Jul 13, 2025
cursor[bot]

This comment was marked as outdated.

@iziang merged commit 6d2fcb7 into main Jul 13, 2025
6 of 7 checks passed
@iziang deleted the support/env branch July 13, 2025 08:01

@cursor (bot) left a comment

Bug: Redis URL Issues and Database Default Change

The commit introduces two issues:

  1. Redis URL Malformation: `CELERY_BROKER_URL` and `MEMORY_REDIS_URL` are malformed when `redis_user` or `redis_password` is an empty string (e.g., `redis://:@host:port/0`). Unlike the `ES_HOST` construction, the Redis URL construction does not handle empty credentials.
  2. Breaking Change in Default Database: The default `DATABASE_URL` behavior changed from SQLite to PostgreSQL. Previously, it defaulted to `sqlite:///{BASE_DIR}/db.sqlite3`. Now, if not explicitly set, it defaults to `None` and is then constructed as a PostgreSQL URL from the `postgres_*` fields. This breaks backward compatibility for existing deployments that rely on the SQLite default.
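
One way to avoid the malformed `redis://:@host:port/0` form is to emit the userinfo part only when a credential is actually set. The helper below is a minimal sketch, not the project's code: `build_redis_url` and its parameters are hypothetical names, and percent-encoding the credentials with `urllib.parse.quote` is an added assumption rather than something this PR does.

```python
from urllib.parse import quote


def build_redis_url(host: str, port: int, db: int, user: str = "", password: str = "") -> str:
    """Build a redis:// URL, omitting the credentials section when it is empty.

    Hypothetical helper for illustration; not part of aperag/config.py.
    """
    # Percent-encode credentials so characters such as "@" or ":" cannot break the URL.
    user_enc = quote(user, safe="")
    password_enc = quote(password, safe="")

    if user and password:
        auth = f"{user_enc}:{password_enc}@"
    elif password:
        # Password-only form: an empty user followed by ":password".
        auth = f":{password_enc}@"
    elif user:
        auth = f"{user_enc}@"
    else:
        # No credentials at all: emit neither ":" nor "@".
        auth = ""

    return f"redis://{auth}{host}:{port}/{db}"
```

With no credentials this yields `redis://localhost:6379/0` for `build_redis_url("localhost", 6379, 0)`, and the same helper could serve both the `/0` broker URL and the `/1` memory URL.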

aperag/config.py#L71-L189

ApeRAG/aperag/config.py

Lines 71 to 189 in 860985c

# Database
database_url: Optional[str] = Field(None, alias="DATABASE_URL")
# Database connection pool settings
db_pool_size: int = Field(20, alias="DB_POOL_SIZE")
db_max_overflow: int = Field(40, alias="DB_MAX_OVERFLOW")
db_pool_timeout: int = Field(60, alias="DB_POOL_TIMEOUT")
db_pool_recycle: int = Field(3600, alias="DB_POOL_RECYCLE")
db_pool_pre_ping: bool = Field(True, alias="DB_POOL_PRE_PING")
# Auth
auth_type: str = Field("none", alias="AUTH_TYPE")
auth0_domain: str = Field("aperag-dev.auting.cn", alias="AUTH0_DOMAIN")
auth0_client_id: str = Field("", alias="AUTH0_CLIENT_ID")
authing_domain: str = Field("aperag.authing.cn", alias="AUTHING_DOMAIN")
authing_app_id: str = Field("", alias="AUTHING_APP_ID")
logto_domain: str = Field("aperag.authing.cn", alias="LOGTO_DOMAIN")
logto_app_id: str = Field("", alias="LOGTO_APP_ID")
# Celery
celery_broker_url: Optional[str] = Field(None, alias="CELERY_BROKER_URL")
celery_result_backend: Optional[str] = None # Will be set in __post_init__
celery_beat_scheduler: str = "django_celery_beat.schedulers:DatabaseScheduler"
celery_worker_send_task_events: bool = True
celery_task_send_sent_event: bool = True
celery_task_track_started: bool = True
local_queue_name: str = Field("", alias="LOCAL_QUEUE_NAME")
# Model configs
model_configs: Dict[str, Any] = {}
# Embedding
embedding_max_chunks_in_batch: int = Field(10, alias="EMBEDDING_MAX_CHUNKS_IN_BATCH")
# Memory backend
memory_redis_url: Optional[str] = Field(None, alias="MEMORY_REDIS_URL")
# Vector DB
vector_db_type: str = Field("qdrant", alias="VECTOR_DB_TYPE")
vector_db_context: str = Field(
    '{"url":"http://localhost", "port":6333, "distance":"Cosine"}', alias="VECTOR_DB_CONTEXT"
)
# Object store
object_store_type: str = Field("local", alias="OBJECT_STORE_TYPE")
object_store_local_config: Optional[LocalObjectStoreConfig] = None
object_store_s3_config: Optional[S3Config] = None
# Limits
max_bot_count: int = Field(10, alias="MAX_BOT_COUNT")
max_collection_count: int = Field(50, alias="MAX_COLLECTION_COUNT")
max_document_count: int = Field(1000, alias="MAX_DOCUMENT_COUNT")
max_document_size: int = Field(100 * 1024 * 1024, alias="MAX_DOCUMENT_SIZE")
max_conversation_count: int = Field(100, alias="MAX_CONVERSATION_COUNT")
# Chunking
chunk_size: int = Field(400, alias="CHUNK_SIZE")
chunk_overlap_size: int = Field(20, alias="CHUNK_OVERLAP_SIZE")
# Fulltext search
es_host: Optional[str] = Field(None, alias="ES_HOST")
es_timeout: int = Field(30, alias="ES_TIMEOUT") # ES request timeout in seconds
es_max_retries: int = Field(3, alias="ES_MAX_RETRIES") # Max retries for ES requests
# Qianfan
qianfan_api_key: str = Field("", alias="QIANFAN_API_KEY")
qianfan_secret_key: str = Field("", alias="QIANFAN_SECRET_KEY")
# OCR/ASR
whisper_host: str = Field("", alias="WHISPER_HOST")
paddleocr_host: str = Field("", alias="PADDLEOCR_HOST")
docray_host: str = Field("", alias="DOCRAY_HOST")
# Register mode
register_mode: str = Field("unlimited", alias="REGISTER_MODE")
# Cache
cache_enabled: bool = Field(True, alias="CACHE_ENABLED")
cache_ttl: int = Field(86400, alias="CACHE_TTL")
# Opik
opik_api_key: str = Field("", alias="OPIK_API_KEY")
opik_workspace: str = Field("", alias="OPIK_WORKSPACE")
def __init__(self, **kwargs):
    super().__init__(**kwargs)
    # Load model configs from file
    import json
    import os
    json_path = os.path.join(BASE_DIR, "model_configs.json")
    if os.path.exists(json_path):
        with open(json_path, "r", encoding="utf-8") as f:
            self.model_configs = json.load(f)
    # DATABASE_URL
    if not self.database_url:
        self.database_url = (
            f"postgresql://{self.postgres_user}:{self.postgres_password}"
            f"@{self.postgres_host}:{self.postgres_port}/{self.postgres_db}"
        )
    # CELERY_BROKER_URL
    if not self.celery_broker_url:
        self.celery_broker_url = (
            f"redis://{self.redis_user}:{self.redis_password}"
            f"@{self.redis_host}:{self.redis_port}/0"
        )
    # CELERY_RESULT_BACKEND
    if not self.celery_result_backend:
        self.celery_result_backend = self.celery_broker_url
    # MEMORY_REDIS_URL
    if not self.memory_redis_url:
        self.memory_redis_url = (
            f"redis://{self.redis_user}:{self.redis_password}"
            f"@{self.redis_host}:{self.redis_port}/1"
        )
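
For the `DATABASE_URL` fallback in the excerpt above, one possible way to keep the earlier SQLite default is to build the PostgreSQL URL only when a PostgreSQL host is configured. This is a sketch under assumptions, not the maintainers' chosen fix: `resolve_database_url` is a hypothetical function, the parameter defaults are invented for illustration, and treating an unset `postgres_host` as "no PostgreSQL configured" is a design assumption that may not match how the `postgres_*` fields are declared.

```python
from typing import Optional
from urllib.parse import quote


def resolve_database_url(
    database_url: Optional[str],
    base_dir: str,
    postgres_host: Optional[str] = None,
    postgres_port: int = 5432,
    postgres_user: str = "",
    postgres_password: str = "",
    postgres_db: str = "",
) -> str:
    """Pick a database URL in priority order (hypothetical helper).

    1. An explicit DATABASE_URL always wins.
    2. Otherwise, a PostgreSQL URL is built if a host is configured.
    3. Otherwise, fall back to the SQLite file described in the review.
    """
    if database_url:
        return database_url

    if postgres_host:
        # Percent-encode credentials so special characters stay inside userinfo.
        user = quote(postgres_user, safe="")
        password = quote(postgres_password, safe="")
        return f"postgresql://{user}:{password}@{postgres_host}:{postgres_port}/{postgres_db}"

    # Assumed to mirror the previous default: sqlite:///{BASE_DIR}/db.sqlite3
    return f"sqlite:///{base_dir}/db.sqlite3"
```

Whether a silent SQLite fallback is desirable is a judgment call; a deployment that intends to use PostgreSQL but forgets the host would quietly start on SQLite, so logging the chosen backend may be worth considering.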
